DE-83-03
Development of Driver
Education Evaluation Tests
- Summary Report -


Ontario 
Ministry of 


Transportation and 
Communications 


Transportation 
Technology and 
Energy Branch




Development of Driver 


Education Evaluation Tests 


- Summary Report - 


Principal Investigators 


G.R. Engel 

M. Townsend 

Engel & Townsend Consultants 
Toronto, Ontario 


Project Monitor 


L.V. Clifford 
Research Officer 
Human Factors Section, MTC 


Prepared for 
Safety Co-ordination & Development Office 
Transportation Regulation Branch 


Published by 

The Transportation Technology and Energy Branch 
Ontario Ministry of Transportation and Communications 
Hon. James W. Snow, Minister 

H.F. Gilbert, Deputy Minister 


Published without prejudice
as to the application of the findings.

Crown copyright reserved; however, this 
document may be reproduced for non-commercial 
purposes with attribution to the Ministry. 


This document does not necessarily 
represent the views and policies of 
the Ministry. 


For additional copies, write: 

The Editor, Technical Publications 

Ontario Ministry of Transportation and Communications 
1201 Wilson Avenue 

Downsview, Ontario 

Canada M3M 1J8 


November 1983 




ABSTRACT 


This summary report gives the general reader an overview 
of how a Driving Knowledge Test and a Driving Situations 
Test were developed. Two separate reports "Examiner's 
Manual - Driving Knowledge Test" DE-83-02, and "Examiner's 
Manual - Driving Situations Test" DE-83-03 contain all
the technical details that are needed to understand how
the two tests were constructed; how the tests are to be 
administered, scored, and interpreted; and how to evaluate 
the validities and reliabilities of the tests. 


The two tests were designed to be used in evaluating the 
Ontario Ministry of Transportation and Communications’ 
high school driver education program. One test is a test 
of knowledge about safe driving; the other is a test of 
an individual's sensitivity to accident risks in different
driving situations. 


Both tests meet standards of acceptable validity and 
reliability as defined by modern testing literature, 
and by industrial usage. The tests should be useful 
as intermediate instruments for evaluating driver 
education. 




ACKNOWLEDGEMENTS 


This project could not have been completed without the 
generous support and help of many individuals. The following 
people have our sincerest thanks. 


Barry Betzner (Toronto Transit Commission); Barry Bragg (Abt
Associates); Ed Blake (Ministry of Transportation and
Communications); James Bookbinder (T.T.C.); [illegible];
[illegible]; R.G. Crothers (Alberta Transport);
David Duncan (M.T.C.); Audrey Foden (M.T.C.); Ralph Gallienne
(M.T.C.); Bill Johnson (M.T.C.); Brian Jonah (Transport
Canada); William Keen (M.T.C.); Al Manly (Thornlea Secondary
School); Angus MacFarland (T.T.C./Amalgamated Transit Union);
Gordon Nakashima (M.T.C.); Phil Randell (Driver Education
Consultants/Donhead Secondary School); Barbara Rowe (York
County Board of Education); [illegible];
Lakerem Sukhu (Ontario Motor League); [illegible];
Linda Tonelli (M.T.C.); Bill Towne (Donhead Secondary School);
Paul Wake (M.T.C.).


We would also like to thank all of the people who
volunteered to act as subjects for this project: Toronto Transit
Commission Drivers; Driver Education Students at Donhead, King
City, Markham District, Sutton District, and Thornlea Secondary
Schools; and many members of the general public.


SUMMARY 


The purpose of this project was to produce two tests to
be used to evaluate the Ontario Ministry of Transportation and
Communications' revised High School Driver Education programme.
The resulting tests were the Driving Knowledge Test, which
measures knowledge of safe driving, and the Driving Situations
Test, which measures risk-taking tendencies in different driving
situations.


The specifications for both tests required, firstly,
that the tests' contents reflect experts' opinions about what
is important to safe driving; and secondly, that the tests
be able to discriminate among drivers acknowledged to represent
different levels of safe driving ability. To meet the first
requirement, the tests were designed according to content
criteria that were systematically derived from expert opinion.
To meet the second requirement, the tests were validated, and
crossvalidated, on three groups of drivers: full-time
professional fleet drivers, driver education students, and a
group called nine-point drivers. The nine-point drivers were
drivers who had accumulated nine or more demerit points on their
driving records as a result of traffic violations. The
validities of the tests were measured by the degree to which
scores on the tests would discriminate among drivers from these
three groups.


The resulting Driving Knowledge Test has two forms, each 
containing 60 multiple choice items. The items cover 21 knowledge 
areas defined by the experts as being relevant to evaluating 
driver education programmes. These areas range from defensive 
driving to traffic signs and laws. The test can be administered 
individually or to groups, and it takes about 30 minutes to 
complete. The maximum score is 60. Average professional
drivers score about 50; average students score about 35; and
average nine-point drivers score about 40. About 90% of the
professionals score higher than about 90% of students. The 
validity of the test is about 0.75, for discriminating between 
professional and student drivers. The coefficient of internal 
consistency, or reliability, is about 0.85. 
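
As an illustration of what a validity coefficient of this kind
can look like, the sketch below computes a point-biserial
correlation between test scores and group membership. This is an
assumption made for illustration only: this summary does not say
which statistic was used, and the function name, the scores and
the labels are all hypothetical.

```python
from statistics import mean, pstdev


def point_biserial(scores, in_group):
    """Correlate test scores with a 0/1 group label (1 = professional)."""
    if len(scores) != len(in_group):
        raise ValueError("scores and labels must have the same length")
    high = [s for s, g in zip(scores, in_group) if g]
    low = [s for s, g in zip(scores, in_group) if not g]
    if not high or not low:
        raise ValueError("both groups need at least one driver")
    spread = pstdev(scores)  # population SD of all scores pooled
    if spread == 0:
        raise ValueError("scores do not vary")
    p = len(high) / len(scores)  # proportion of drivers labelled 1
    return (mean(high) - mean(low)) / spread * (p * (1 - p)) ** 0.5


# Made-up scores out of 60; these are NOT data from the report.
scores = [52, 49, 51, 48, 36, 33, 38, 34]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = professional, 0 = student
validity = point_biserial(scores, labels)
print(round(validity, 2))
```

A coefficient near 1 means that the two groups barely overlap on
the test; a coefficient near 0 means that group membership tells
us almost nothing about the score.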


Since validity coefficients were calculated for discrimin- 
ating among and between all three driver groups; for both Forms 
A and B; and for experimental and crossvalidation administrations 
of the test, approximate numbers are presented here to give 
the reader the flavour of the overall results. Complete details 
can be found in the body of the report. 


Reliability coefficients were calculated for both the 
experimental and crossvalidation administrations of the test and 
for both Form A and Form B. The reliability coefficient reported 
here is, once again, a summary statistic. All validity and
reliability coefficients are significant at the .01 level or
better.
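
A split-half reliability coefficient can be sketched as follows.
The odd-even split and the Spearman-Brown step-up used here are
common textbook choices and are assumptions on our part; the
exact procedure used for these tests is described in the
manuals, not in this summary. The response matrix is made up.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the half-test scores does not vary")
    return sxy / (sxx * syy) ** 0.5


def split_half_reliability(item_matrix):
    """item_matrix: one row per examinee, one 0/1 entry per item."""
    first_half = [sum(row[0::2]) for row in item_matrix]   # items 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in item_matrix]  # items 2, 4, 6, ...
    r_half = pearson(first_half, second_half)
    # Spearman-Brown step-up from half-length to full-length test.
    return 2 * r_half / (1 + r_half)


# Made-up answers of five examinees to six items (1 = correct).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
reliability = split_half_reliability(responses)
print(round(reliability, 2))
```

The step-up is needed because each half is only half as long as
the full test, and shorter tests are less reliable than longer
ones made of comparable items.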


The Driving Situations Test contains 40 items. Each
item gives a scenario describing a driving situation. Each
scenario contains three out of five possible dimensions,
which describe a driver's behaviour, a driver's physiological
state, a weather condition, a condition of the vehicle, and
a reason for driving. After reading each scenario the examinee
rates how much trade-off the driver in the scenario is making
between driving and the risk of an accident. The trade-off
is rated on a seven-point scale.


The score on the test is the arithmetic average of all
of the examinee's ratings. The validity of the test is in the
range of 0.40 to 0.50, depending on whether the test is
discriminating among professional and non-professional drivers,
or between students and professional drivers. The reliability of
the test is about 0.96. Once again these values are
approximations and complete details can be found in the body of
the report. All reliability and validity coefficients for this
test are significant at the .01 level or better.
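
The scoring rule for the Driving Situations Test can be sketched
in a few lines. The function name is hypothetical, and the
assumption that the seven-point scale runs from 1 to 7 is ours;
the scale anchors are given in the manual for the test.

```python
def situations_score(ratings, scale_min=1, scale_max=7):
    """Arithmetic average of one examinee's trade-off ratings."""
    if not ratings:
        raise ValueError("no ratings to score")
    for rating in ratings:
        if not scale_min <= rating <= scale_max:
            raise ValueError(f"rating {rating} is off the scale")
    return sum(ratings) / len(ratings)


# A made-up examinee who rates half of the 40 scenarios 3
# and the other half 5.
example_ratings = [3] * 20 + [5] * 20
example_score = situations_score(example_ratings)
print(example_score)
```

Because the score is an average and not a total, it stays on
the same seven-point scale as the individual ratings.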


Scores on the tests are moderately related to each other
such that the higher the knowledge score, the more likely a
driver is to attribute risk to the various driving situations.
Scores on the Driving Situations Test cannot be completely
explained by driving knowledge, however. That is, there appear
to be genuine individual differences among drivers in their
willingness to accept risks, independent of knowledge about
the risks.


Recommendations 


The two tests cover both driving knowledge and attitude
toward risk-taking in driving situations. Both tests meet
standards of acceptable validity and reliability as defined by
modern testing literature, and by industrial usage. By these
criteria, the tests should be useful as intermediate instruments
for evaluating driver education, either for evaluating
individual students or for evaluating groups of students.




CONTENTS


INTRODUCTION

DEVELOPMENT OBJECTIVES AND STRATEGY

CONSTRUCTION AND VALIDATION OF
THE DRIVING KNOWLEDGE TEST
   Analysis of the Experimental Test Results
   Validity of the Knowledge Test
   Reliability of the Knowledge Test

CONSTRUCTION AND VALIDATION OF
THE DRIVING SITUATIONS TEST
   Validity of the Driving Situations Test
   Reliability of the Driving Situations Test
   [illegible]

RELATIONSHIP BETWEEN THE DRIVING
KNOWLEDGE TEST AND THE DRIVING
SITUATIONS TEST

[illegible]

[illegible]

[illegible]


a 


LIST OF TABLES


TABLES FOR THE DRIVING KNOWLEDGE TEST

I     Means and Standard Deviations
II    Professional - Student Validities
III   Professional - Nine-point Validities
IV    Professional - Nine-point plus Student Validities
V     Split-half Reliabilities


TABLES FOR THE DRIVING SITUATIONS TEST

VI    Means and Standard Deviations
VII   Professional - Student Validities
VIII  Professional - Nine-point Validities
IX    Professional - Nine-point plus Student Validities
X     Split-half Reliabilities




INTRODUCTION 

This report describes the development of two tests,
the Driving Knowledge Test and the Driving Situations Test.
These tests were designed to be used in evaluating the Ontario
Ministry of Transportation and Communications' revised High
School Driver Education programme. One test is a test of
knowledge about safe driving; the other is a test of an
individual's sensitivity to accident risks in different driving
situations. As evaluation devices, the tests would be used to
see whether or not graduates of the new driver education
programme demonstrate greater knowledge of safe driving
practices, and greater consciousness of driving risks, than
graduates of the old programme.


The development of the tests included producing a
user manual for each test.(1,2) The user manuals contain all
the technical details that would be needed to understand
how the tests were constructed; how the tests are to be
administered, scored and interpreted; and how to evaluate
the validities and reliabilities of the tests. This report
will not repeat all the details given in the manuals. Instead
it will give the general reader an overview of how the tests
were developed.


Development of the tests formed part of a multiphase
programme aimed at producing, and evaluating, a new driver
education course for Ontario high schools. The first phase
of the programme produced a strategy for evaluating
the new course. This included defining driver education
effectiveness criteria, reviewing then-existing evaluation
instruments that could be used to evaluate a new driver education
course, and designing specifications for new instruments
where existing instruments would not satisfy particular
effectiveness criteria. The Driving Knowledge Test and the
Driving Situations Test represent two of the new evaluation
instruments identified in the Phase 1 planning study. Details
of the Phase 1 study are described in a report entitled
"Revision and Evaluation of Driver Education in Ontario,
Phase 1: Development of an Evaluation Plan".(3)


In parallel with the Phase 1 study, a Phase 2 study
produced curriculum specifications for the new course. The
details of the Phase 2 study are reported in "Revision and
Evaluation of Driver Education in Ontario, Phase 2:
Preparation of a Curriculum Development Plan".(4) The new driver
education course specified by Phase 2 has been developed
concurrently with, and independently of, the test development
work that will be described here. A brief and readable overview
of the work done in both Phases 1 and 2 will be found in
"Revision and Evaluation of Driver Education in Ontario:
Summary of the First Two Phases of the Study".(5)


The Driving Knowledge Test and the Driving Situations
Test have been developed in the context of a single project,
using a common test development strategy to meet a common
set of objectives. We will begin describing their development
by describing these general objectives and the development
strategy; then we will describe the construction and
validation of each test separately. Following that, we will
discuss the roles of the two tests as parts of a test battery.
To end the report, we will evaluate the strengths and limitations
of the two tests relative to the objectives that they were
intended to meet.




DEVELOPMENT OBJECTIVES AND STRATEGY 


The Phase 1 study(3) recommended developing a driving
knowledge test to assess specific knowledge areas defined
by the study. The Phase 1 study also recommended developing
the knowledge test by using the same method as had been
previously used in developing a Transport Canada driving
knowledge test.(6) The Transport Canada test had been designed as
a criterion valid test in which the criterion of driving
knowledge was how well an individual scored on the test relative
to the scores of professional drivers from commercial vehicle
fleets.


Although items in the Transport Canada test had been
chosen to represent a reasonably wide range of driving
knowledge, they had not been chosen with predetermined knowledge
areas in mind. Instead, items for the Transport Canada test
were chosen primarily for their ability to discriminate
between professional drivers and driver education students.
This meant that items on the Transport Canada test, while
discriminating between professionals and students, might not
sample knowledge relevant to safe driving as defined in the
Phase 1 study.


The objective of developing a new knowledge test was,
then, to produce a test that would both cover specific areas
of knowledge, and discriminate among different groups of
drivers acknowledged to represent different levels of safe
driving abilities.


In test construction theory, a test designed to cover
predetermined areas of content is said to be a Criterion
Referenced test. A Criterion Referenced test usually means
that the content of the test has been defined by a panel of
experts. And indeed, it was a panel of experts who defined
the driving knowledge areas produced in the Phase 1 study.


A test that can discriminate among individuals representing
different levels of ability, or knowledge in this case, is
said to be a Norms Referenced test. The Transport Canada
test was essentially a Norms Referenced test. Each item
in this test was an item for which professional drivers got
the correct answer substantially more often than driver
education students did.


The main objective in developing the Driving Knowledge
Test, and indeed the main challenge, was that of developing
a test that would be both a Criterion Referenced test and
a Norms Referenced test. The challenge in developing such
a test lies in the fact that a test judged by experts to
cover all essential areas of driving knowledge can still
fail to discriminate among drivers representing different
levels of safe driving ability. Conversely, a test that
discriminates well among drivers of different ability can
still cover areas of knowledge that are not particularly
relevant to safe driving.


Two examples will help to give some insight
into this problem. The most powerful item on the Transport
Canada test for discriminating between professional drivers
and students was an item that asked when one should renew
his or her driver's licence. Virtually all professional
drivers got the correct answer to this item, and only about
half of the students got the correct answer (the students had
never renewed their licences before). This was a good item
from a Norms Referenced point of view, but a poor item from
a safe driving or Criterion Referenced point of view. An
example of just the opposite problem came up in the
development of the new Knowledge test. Experts considered knowledge
of the effects of alcohol and drugs to be an important area
for evaluating driver education programmes. However, it turned
out to be next to impossible to find drug and alcohol items
that not everyone could answer correctly. In other words,
drivers at all levels seemed to be knowledgeable about drug
and alcohol effects, and items on this topic are not very
useful for discriminating among drivers at different levels.
Items on this topic, while being good Criterion Referenced
items, are poor Norms Referenced items.
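
The two examples can be made concrete with a simple item
discrimination index, taken here as the proportion of
professionals answering an item correctly minus the proportion
of students answering it correctly. This is a minimal sketch
under our own assumptions: the index, the 0.2 cut-off, the
function names and all answer counts are hypothetical and are
not taken from the test development data.

```python
def proportion_correct(answers):
    """Share of 1s in a list of 0/1 answers to a single item."""
    if not answers:
        raise ValueError("no answers given")
    return sum(answers) / len(answers)


def discrimination(professional_answers, student_answers):
    """Professional minus student proportion correct for one item."""
    return (proportion_correct(professional_answers)
            - proportion_correct(student_answers))


def keep_item(in_expert_area, professional_answers, student_answers,
              min_gap=0.2):
    """Keep an item only if it meets BOTH objectives.

    in_expert_area: Criterion Referenced check (experts list the topic).
    min_gap: Norms Referenced check (hypothetical cut-off).
    """
    gap = discrimination(professional_answers, student_answers)
    return in_expert_area and gap >= min_gap


# Made-up counts for 20 professionals and 20 students per item.
licence_pros, licence_students = [1] * 19 + [0], [1] * 10 + [0] * 10
alcohol_pros, alcohol_students = [1] * 19 + [0], [1] * 19 + [0]

# Licence renewal: discriminates well, but not a safe driving topic.
licence_kept = keep_item(False, licence_pros, licence_students)
# Alcohol effects: an expert topic, but everyone knows the answer.
alcohol_kept = keep_item(True, alcohol_pros, alcohol_students)
print(licence_kept, alcohol_kept)
```

Both example items are rejected, each for a different reason,
which is exactly the tension between the two kinds of test.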


To put the two examples just given in perspective, both
are rather extreme. As will be seen later on, the
driving knowledge test that emerged from the present project
did turn out to meet both the Criterion and the Norms
Referenced objectives reasonably well. At the same time, however,
it is worth noting that in the past, tests of driving
knowledge, not to mention other driving abilities, have tended
to be developed from a Criterion Referenced approach, and
developing a test to meet both objectives was still a new
challenge.


Turning to the Driving Situations Test, the Phase 1
study recommended developing a driving attitude test that
would measure individuals' expectations of the probabilities
of accidents under different driving circumstances. Implicit
in the use of such a test as an evaluation device is the
idea that a well trained driver should have higher, or at
least more realistic, expectations of the likelihoods of
accidents than poorly trained or untrained drivers do. The
specific approach recommended in the Phase 1 study suggested
having persons taking the test literally estimate accident
probabilities under different conditions. The rationale for
this approach was derived from studies of the effects of public
information campaigns on drinking and driving.




The public information studies were ones done on advertising
campaigns designed to increase drivers' subjective estimates
of the likelihood of being stopped and arrested if they were
driving under the influence of alcohol. Research showed that
such campaigns increased drivers' subjective estimates of the
probabilities of being caught for drinking and driving.
Treating driver education as analogous to an advertising
campaign, it would be reasonable to expect a driver education
course to increase subjective probability estimates not only
about being caught for drinking and driving, but also about the
outcomes of a variety of other risky driving situations.
This was the approach set out as the terms of reference for
developing the Driving Situations Test.


Like the specifications for the Driving Knowledge Test,
the specifications for the Driving Situations Test also
included particular areas to be addressed by the test. In
this sense, the Driving Situations Test was to be designed
as a Criterion Referenced test. A Criterion Referenced attitude
test is, however, a potentially trivial test. It is reasonable
to expect experts to be able to say with some authority what
a driver does or does not need to know about driving, but it
is probably not so easy for experts to say how drivers should
respond when asked about ephemeral quantities like subjective
probabilities. For this reason the Driving Situations Test
was developed to be primarily a Norms Referenced test whose
strength would be judged on its ability to discriminate among
drivers representing different levels of safe driving.


The development of the two tests was designed so that
both tests could be treated as parts of a single test battery.
This would make it possible to see whether or not the two
tests measured different qualities related to the effectiveness
of driver education as intended; and to see whether or not
the two tests as a battery would form a more powerful evaluation
instrument than either test alone.


Initially, two groups of subjects were identified for
the purposes of Norms Referenced validation: professional
drivers, and driver education students about to graduate from
the present Ontario high school driver education course.
The professional drivers were full-time drivers from a
commercial fleet. The professionals were defined as the norm
for safe driving. Both tests were then to be constructed
to discriminate between the professionals and the students.
One would then expect graduates of an improved driver education
course to get test scores closer to those of professional
drivers than graduates from the old course did.


In addition to the student and professional groups, a
third group of drivers was included in the validation. These
were drivers who had accumulated nine or more demerit points
on their driving records as a result of traffic violations.
These drivers were tentatively identified as "bad" drivers in
contrast to the professionals who represented "good" drivers.


To meet a final development objective, the development 
plans included a crossvalidation phase in which the tests 
were given to a new and independent sample of drivers after 


they had been validated on the original sample of drivers. 


The purpose of crossvalidation is to ensure that the
validity of a test is not a chance occurrence, and that it will
remain stable when new groups of people take the test.
Crossvalidation, or even validation for that matter, has not
been a widely applied procedure in developing driving tests.
Nevertheless, there are some notorious examples of tests in 
other fields that initially appeared to be quite powerful tests, 


but which turned out to be almost useless when applied to new 




groups of test takers. Because of this possibility, crossvalid- 
ation was treated as an integral part of developing the present 


tests. 


The objectives set out in developing the Driving Knowledge 
Test and the Driving Situations Test can be briefly summarized 


as follows: 


- The resulting tests should be both Criterion Referenced
  (cover areas defined by experts) and Norms
  Referenced (discriminate among drivers known to
  represent different levels of driving ability);

- Each test should measure demonstrably different
  qualities related to safe driving;

- The Norms Referenced validities of the tests should
  meet the test of independent crossvalidation.


Having set out the background and objectives of the 
development work, we will now describe the construction and 
validation of each test in turn; beginning with the Driving 


Knowledge Test. 




CONSTRUCTION AND VALIDATION OF 
THE DRIVING KNOWLEDGE TEST 


The experts in the Phase I Study defined the following 
areas of knowledge as important and relevant to evaluating a 


driver education course: 


- alcohol and drug effects
- driving on curves
- defensive driving
- emergency procedures
- lane changing
- hazard detection
- highway/freeway driving
- intersections
- limited visibility/night driving
- merging
- passing
- pedestrians
- right-of-way
- road conditions
- seat belts
- speed control
- stopping
- surveillance
- traffic signs and laws
- turning
- urban driving


We began the test construction process with the intention
of emerging with a test made up of two parallel forms of
50 to 60 items each. This was to be achieved by constructing
a provisional test containing about twice this many items;
administering the provisional test to a sample of experimental 
subjects; and then keeping the best items from the provisional 


test for the final version of the test. 


To develop the provisional test we drew on a pool of
1,313 items developed by the Highway Safety Research Institute (7)
at the University of Michigan. This pool was culled for
items that would fit the knowledge areas defined by the Phase I




experts. Some items were taken and put into the provisional 
test without modification; others were modified to suit the 
purposes of the present test; and where no suitable items could 
be found in the Michigan pool, new items were written. This
process yielded 255 items.


As a formal step in producing a Criterion Referenced
test, the 255 items were submitted to a panel of experts who
were asked to perform a procedure called the Angoff (8) procedure.
In the present context, the Angoff procedure consisted of each 
expert going through the provisional test and for each item, 
estimating the percentage of minimally qualified driver education 
graduates who should be able to get the correct answer to the 


item. 


At the same time as they were performing the Angoff 
procedure, the experts were asked to use a five point scale to 
rate how relevant each item was to safe driving. They were asked 
to treat the number "1" on the scale as meaning "not relevant 
at all to safe driving" and the number "5" on the scale as 
meaning "very relevant to safe driving". Of the original panel of 
10 experts, six completed the task. Three of these were exper- 
ienced classroom instructors, two were traffic safety research 


professionals, and one was a senior driver education administrator. 


The results of the Angoff procedure were used to evaluate 
whether or not each item addressed an area that a driver should 
know, and to evaluate each item's difficulty relative to what 
a driver could be expected to know. The experts' relevance 
ratings were used to supplement these evaluations. From the 
Angoff results and the relevance ratings, some of the prov- 
isional items were eliminated, and a few more were modified 
according to suggestions made by individual experts. The end 
result of this was a pool of 244 items which we divided into 


two roughly parallel tests of 122 items each. We will call 




these tests Form A and Form B of the experimental test.


The experimental test was administered to three groups 
of subjects: Professional drivers, Nine-point drivers, and 
Driver Education Students. The professional drivers were 
full-time fleet drivers from an organization whose drivers 
must meet and maintain high standards of proficiency and safe 
driving. The nine-point drivers were drivers who had just 
completed an interview with a Driver Improvement Counsellor 
at one of the Ontario Ministry of Transportation and Commun- 
ications Driver Control Centres. These were individuals who 
had accumulated nine or more demerit points on their driving 


records as a result of traffic violations. 


The students in the sample were from a high school 
driver education course sponsored jointly by the Ontario 
Ministry of Education and the Ministry of Transportation and 
Communications. The course was taught in the school year 1981-82 
and was based on the text "Power Under Control". This course 
is the standard one for Ontario high schools, and it is comparable 
with high school driver education courses given throughout 
Canada and the United States. 


For purposes of validation, the students took the
experimental test in either the last or the second-to-last class
of their course. In addition, the students took one form of
the test, either Form A or Form B, on the first day of their


course. 


The experimental sample consisted of 150 professionals, 
150 nine-point drivers, and 150 students. The professionals 
were drawn at random from an employee roster. The nine-point 
drivers were asked to volunteer to take the test at the time 
they completed an interview with a Counsellor. The students 


came from classes whose instructors volunteered to cooperate 




in this project. The professionals and nine-point drivers were 
paid for writing the tests. The students were paid depending on 
whether or not local school board policy permitted payment. 


Analysis of the 
Experimental Test Results 


The purpose of this analysis was to identify the items 
that would go into the final version of the test. The analysis 
was done by a process called item analysis. An item analysis 
consists of finding each item's validity and reliability. For 
present purposes, an item would be valid to the extent that 
professional drivers tended to get the correct answer more 
often than either students or nine-point drivers did. The 
reliability of an item amounted to the extent to which getting 
the correct answer on an item corresponded to an individual 
getting a high total score on the test. Constructing the final 
version of the test consisted of retaining those items that had 
high validities and acceptable levels of reliability. 
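
The two item statistics just described can be illustrated with a short sketch. It is not the procedure documented in the manuals: the function names, the six-driver data set, the use of a difference in proportions correct as the validity index, and the use of an uncorrected item-total correlation as the reliability index are all assumptions made for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def item_validity(responses, groups, item, high="professional", low="student"):
    """Proportion correct in the high group minus proportion correct in the low group."""
    def p_correct(group):
        rows = [r for r, g in zip(responses, groups) if g == group]
        return sum(r[item] for r in rows) / len(rows)
    return p_correct(high) - p_correct(low)


def item_reliability(responses, item):
    """Correlation between an item score and the total score.

    The total includes the item itself (an uncorrected item-total correlation).
    """
    totals = [sum(r) for r in responses]
    scores = [r[item] for r in responses]
    return pearson(scores, totals)


# Hypothetical data: one row per driver, one column per item, 1 = correct.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
]
groups = ["professional"] * 3 + ["student"] * 3

for i in range(4):
    print(i, round(item_validity(responses, groups, i), 2),
          round(item_reliability(responses, i), 2))
```

An item would be a candidate for retention when the first figure is clearly positive and the second is acceptable without being so high that the test narrows to a single factor.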


There is a certain amount of judgement involved in 
choosing items in this way. The relationship between item 
validity and item reliability is a complex one. Demanding 
that all items have high item reliability (that is, high
correlations with the total test score) will generally yield
items all of which measure a single factor or narrow range of
ability. Restricting the range of ability measured by a test
typically reduces the test's validity. For example, a test
that samples a wide range of driving knowledge is likely to
be better at discriminating among groups of drivers (hence
have higher validity) than a test that only covers a narrow range
of driving knowledge. Therefore, items were chosen which have


high validity and moderate but not too high reliability. 


In addition to choosing items according to the item analysis, 


the selected items were also checked against the experts' Angoff 




and relevance ratings. By and large, items selected on the basis
of the item analysis were confirmed by the experts' opinions.
However, a few items, on alcohol and seat belts for example,
were retained even though they were not particularly valid for
discriminating among the driver groups. The items were left in
because these topics were considered important for evaluating
driver education by the experts in the Phase I Study.


In doing the item analysis, we also paid attention to the
potential problem of differential validity. It is conceivable
that an item might, for example, discriminate well between
professionals and students but not between professionals and
nine-point drivers. Similarly, another item might discriminate well
between professionals and nine-point drivers but not between
professionals and students. As it turned out, for virtually every
item professionals most often got the correct answer, followed
by nine-point drivers, followed by students. Indeed, the number
of nine-point drivers getting the correct answer to an item
consistently fell about mid-way between the numbers of professionals
and students getting the correct answer. Consequently,
differential validity was not a concern in selecting items.


In a test like this one, the overall validity and reliability
of the test are a function of the validities and
reliabilities of its individual items. Thus, the process
of discarding items from the experimental test included
calculating the overall validity and reliability of different
possible versions of a final test. This amounted to a fine-tuning
process to maximize the validity and reliability of
the test as a whole, and to make the validity and reliability
of each test form as nearly equal as possible.


The item analysis yielded a final version of the test 
consisting of two forms, Form A and Form B, containing 60 
items each. Both forms of the final version of the test were 


then administered to three new groups of drivers: 75 Professionals,
75 Nine-point drivers, and 75 Students. This administration
constituted a crossvalidation to see whether or not
the test's validity estimated from the experimental sample
would be confirmed by the crossvalidation sample.


Validity of the Knowledge Test


In this section we will describe the major results of 
the validation and crossvalidation phases of the test develop- 
ment. Readers who want to know these results in more detail 
should look at the User's Manual for the test. We will begin
by describing the results relevant to the Criterion Referenced
validity of the test; these are the experts' relevance ratings 
and the Angoff ratings. 


Over 95% of the items on the final version of the test 
have an average relevance rating of three or greater on the 
five point scale. There is no item that any single expert rated 
as not relevant. We also calculated the correlation between each 
expert's relevance ratings and every other expert's relevance 
ratings. These correlations varied from around 0 to just over 
0.50. Thus, while the experts generally agreed that all of the
items were relevant to safe driving (scores on the five point
scale), they did not show strong agreement with each other on
exactly how relevant any one individual item might be, as indicated
by the low to moderate correlations among the experts' ratings.


Correlations among the experts' Angoff ratings ranged from
[values illegible] (if one particular expert's ratings
were ignored). These correlations represent moderate but not
striking agreement about the level of difficulty that any particular
item on the test should represent to a minimally qualified
driver education graduate.


We also calculated correlations between the experts’ 


Angoff ratings and the percentages of correct responses actually 




obtained by the experimental driver groups. The correlation 
between the experts' Angoff ratings and drivers' scores on 
individual items was about 0.45 for Form B and about 0.75 for 
Form A. From these correlations, the experts were moderately 
to fairly good at predicting the actual levels of difficulties 
found for individual items. We have no explanation for why 
they were better at predicting the levels for Form A than for 
Form B. However, there was no attempt to construct the two 
forms to be equivalent in terms of experts' Angoff ratings for 
individual items. 


If the experts' Angoff ratings are averaged over the individual
items on a test, one can get what amounts to the experts'
opinion about an acceptable passing score on the test. According
to this calculation, an acceptable passing score on Form A
would be 42.2 (out of 60), and on Form B, it would be 41.0.
Despite the experts' variability in rating individual items,
the average passing scores are quite similar for the two forms
of the test. When this acceptable passing score is compared
with the results actually obtained by drivers in the different
experimental groups, 98% of the professionals would have passed
the test, about 50% of the nine-point drivers would have passed,
and only about 20% of the students would have passed.
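
The passing-score calculation can be sketched in a few lines. The function name and the ratings below are hypothetical, and the sketch assumes each expert supplies, for every item, the estimated percentage of minimally qualified graduates who should answer it correctly. Averaging across experts and summing across items gives the score such a graduate would be expected to obtain.

```python
def angoff_passing_score(ratings):
    """Expected number of items correct for a minimally qualified graduate.

    ratings[e][i] is expert e's estimate, as a percentage (0-100), of
    minimally qualified graduates who should answer item i correctly.
    """
    n_experts = len(ratings)
    n_items = len(ratings[0])
    # Mean estimate for each item across the experts.
    item_means = [sum(r[i] for r in ratings) / n_experts for i in range(n_items)]
    # Summing the item means (as proportions) gives the expected score.
    return sum(item_means) / 100.0


# Hypothetical ratings: three experts, four items.
ratings = [
    [80, 60, 70, 90],
    [70, 50, 80, 80],
    [90, 70, 60, 70],
]
score = angoff_passing_score(ratings)
print(round(score, 2), "out of", len(ratings[0]))
```

On this reading, the Form A figure of 42.2 out of 60 corresponds to a mean expert estimate of about 70% per item.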


To summarize the Angoff results, we believe it is fair
to say that the test can be treated as a respectable Criterion
Referenced test. Considerable care was taken to ensure that
items on the test covered all of the areas identified as being
important in Phase I. And although the experts varied in their
detailed opinions about individual test items, their consensus 
was that all of the items represent relevant knowledge that a 


qualified driver should know. 




An evaluation of the Norms Referenced validity of the 
test was made from an analysis of data obtained in the exper- 
imental and crossvalidation administrations of the test. We will 
describe the evidence for validity first in terms of the mean 
scores obtained by the different driver groups, and then in 
terms of the validity coefficients describing how well scores 


on the test discriminated among the different drivers. 


Table I shows the means and standard deviations of the 
scores obtained on each form of the test by the professionals, 
the nine-point drivers, and the students. The means and standard 
deviations shown in Table I represent results from the experimental
and crossvalidation administrations of the test combined.
Data from the two administrations were combined because each 
group's mean and standard deviation was almost exactly the same 


on both administrations. 


TABLE I


Means and Standard Deviations


                        Form A         Form B

Professional
     Mean                48.53          49.42
     S.D.                 3.50           4.26
     N                     225            225

Nine-point
     Mean                42.58          43.50
     S.D.                 6.85           7.48
     N                     225            225

Student
     Mean          [illegible]          34.25
     S.D.          [illegible]    [illegible]
     N                     225            225




The data in Table I show that the professional drivers 
scored on average 12 to 15 points higher than students on either 
form of the test; and that the nine-point drivers scored about 
mid-way between the professionals and the students. Looking 
at the standard deviations, it can be seen that the professionals' 
scores were grouped fairly closely around their mean score, while 
the scores of the other two groups were more widely dispersed 


about their respective means. 


Using the standard deviations to calculate how much the
scores of the different groups overlapped with one another, it
can be shown that about 95% of the professionals scored higher
than 95% of the students. Similarly, about 70% of the professionals
scored higher than about 70% of the nine-point drivers.
In other words, there is hardly any overlap between the scores
of the professionals and the scores of the students, and very
little overlap between the professionals and the nine-point
drivers. From the data in Table I, it is clear that the test
makes fairly clear discriminations among the three driver groups,
particularly the professionals and the students.
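
The report does not spell out how the overlap figures were calculated. One reading that reproduces the "about 70%" figure treats each group's scores as normally distributed and looks for the score that the same fraction of the higher group exceeds and the lower group falls below. The normality assumption and the function name are ours, and the professionals' Form B standard deviation is taken as 4.26, a best reading of a damaged entry in Table I.

```python
from statistics import NormalDist


def symmetric_overlap(mean_hi, sd_hi, mean_lo, sd_lo):
    """Return (cut score, fraction) such that the same fraction of the
    higher group scores above the cut and of the lower group scores below it,
    assuming both groups' scores are normally distributed."""
    # The cut lies z standard deviations below the higher mean and
    # z standard deviations above the lower mean.
    z = (mean_hi - mean_lo) / (sd_hi + sd_lo)
    cut = mean_lo + z * sd_lo
    fraction = NormalDist().cdf(z)
    return cut, fraction


# Form B, professionals versus nine-point drivers (values as read from Table I).
cut, fraction = symmetric_overlap(49.42, 4.26, 43.50, 7.48)
print(round(cut, 1), round(fraction, 2))
```

With these inputs the fraction comes out close to 0.7, in line with the statement that about 70% of the professionals scored higher than about 70% of the nine-point drivers.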


Validity coefficients are another, and indeed the
conventional, way of expressing a test's Norms Referenced
validity. For present purposes, the validity coefficient is
the statistical correlation between scores on the test and
driver status. Table II shows the validity coefficients
calculated for discriminating between professionals and
students, not including the nine-point drivers. Values are
shown for the experimental and crossvalidation samples separately,
and then for the two samples combined.
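
Because driver status takes only two values in each comparison, a correlation of this kind is a point-biserial correlation, and it can be recovered from group means, standard deviations, and sizes. The sketch below is an illustration rather than the authors' computation: the function name is hypothetical, the standard deviations are assumed to use the n - 1 divisor, and the inputs are the Form B figures for professionals and nine-point drivers as read from Table I (the professionals' standard deviation, 4.26, is a best reading of a damaged entry).

```python
from math import sqrt


def point_biserial(mean1, sd1, n1, mean0, sd0, n0):
    """Correlation between a score and a 0/1 group indicator,
    computed from each group's mean, sample SD (n - 1 divisor), and size."""
    n = n1 + n0
    # Sum of squares within the two groups.
    ss_within = (n1 - 1) * sd1 ** 2 + (n0 - 1) * sd0 ** 2
    # Sum of squares between the two group means.
    ss_between = (n1 * n0 / n) * (mean1 - mean0) ** 2
    return (mean1 - mean0) * sqrt(n1 * n0 / n) / sqrt(ss_within + ss_between)


# Form B: professionals (coded 1) versus nine-point drivers (coded 0).
r = point_biserial(49.42, 4.26, 225, 43.50, 7.48, 225)
print(round(r, 3))
```

The result is about 0.44, close to the combined Form B value of .434 reported for professionals versus nine-point drivers in Table III.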




TABLE II


Professional - Student Validities


                       N      Form A     Form B

Experimental         300       .759       .762
Crossvalidation      150       .720       .743
Combined             450       .749       .758


The validity coefficients shown in Table II are, as 
validity coefficients go, very high indeed. Generally, a good 
test, a good employment selection test for example, will have 
a validity of around 0.50. And such tests can be shown to have 


practical utility with validities as low as 0.20. 


The values shown in Table II also show that there was 
very little shrinkage in the validity of either form of the 
test in going from the experimental administration of the test 
to the crossvalidation administration. In the process of keeping 
or discarding the items in the experimental version of a test to
create a final version of a test, one is bound to keep some 
items that were chance successes on the experimental admin- 
istration; and similarly throw out some items that were by 
chance less successful items. For example, suppose that we 
had administered the experimental version of the test to just 
the professionals and the students, and everyone had responded 
randomly to all of the items. From the laws of chance, we 
would have found at least some items on which professionals 
got the "correct" answer more often than students. If we 
kept those items to form a new test, we would have an apparently 
valid test. However, administering such a test to a cross- 
validation sample, also responding randomly, would quickly 
show that the validity of the edited version of the experimental 


test was purely a chance event. 
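
The thought experiment above can be simulated directly. Everything in this sketch is hypothetical: the group sizes, the 244-item pool, the 60 items kept, the chance rate of 0.25, and the random seed are illustrative choices, and the selection rule (keep the items on which the "professionals" happened to do best) is a simplification of the item analysis.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def random_answers(rng, n_people, n_items, p_correct=0.25):
    """Everyone guesses: each answer is correct with probability p_correct."""
    return [[1 if rng.random() < p_correct else 0 for _ in range(n_items)]
            for _ in range(n_people)]


def validity(pros, students, items):
    """Correlation between total score on the chosen items and group status."""
    scores = [sum(row[i] for i in items) for row in pros + students]
    status = [1] * len(pros) + [0] * len(students)
    return pearson(scores, status)


def simulate(seed=1, n_per_group=75, n_items=244, n_keep=60):
    rng = random.Random(seed)
    pros = random_answers(rng, n_per_group, n_items)
    students = random_answers(rng, n_per_group, n_items)

    # Keep the items on which the "professionals" happened to do best.
    def gap(i):
        return sum(r[i] for r in pros) - sum(r[i] for r in students)

    kept = sorted(range(n_items), key=gap, reverse=True)[:n_keep]
    apparent = validity(pros, students, kept)

    # Crossvalidation: the same items given to fresh random responders.
    new_pros = random_answers(rng, n_per_group, n_items)
    new_students = random_answers(rng, n_per_group, n_items)
    cross = validity(new_pros, new_students, kept)
    return apparent, cross


apparent, cross = simulate()
print("apparent validity:", round(apparent, 2))
print("crossvalidated validity:", round(cross, 2))
```

The apparent validity on the selection sample should be substantial even though no item measures anything, while the crossvalidated value should fall back toward zero.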




Of course this is an extreme example. Nevertheless, 
the process of constructing a test is always diluted by some 
degree of chance relative to the genuine validity of the test. 
Crossvalidation shows the degree to which chance has been oper- 
ating. If the crossvalidation validity values remain close 
to the experimental values, one can be confident that the test's 
validity is stable and relatively undiluted by random effects. 


Table III shows validities for discriminating professionals 
from nine-point drivers, with the students excluded. The values 
in Table III are substantially smaller than those for discrim- 
inating professionals from the students. This reflects the 
fact that the nine-point drivers' scores were substantially 
closer to the professionals' scores than those of the students. 


However, the validity values are still respectable ones. 


TABLE III


Professional - Nine-point Validities


                    N      Form A         Form B
Experimental       300     .481           .437
Crossvalidation    150     .449           .429
Combined           450     [illegible]    .434


Finally, Table IV (next page) shows the validities obtained
for discriminating the professionals from the students and
nine-point drivers as a combined group.


TABLE IV


Professionals - Nine-point plus Students Validities


                    N      Form A         Form B
Experimental       450     .509           .549
Crossvalidation    225     .530           .529
Combined           675     [illegible]    .544


Taking Tables II to IV as a whole, the validities are 
consistently substantial, consistently similar for both forms 
of the test, and remain quite stable from the experimental 
administration to the crossvalidation administration of the 
test. In short, the test meets currently accepted standards 
for Norms Referenced tests. 


Reliability of the Knowledge Test 


A test's reliability gives an index of its measurement 
error. Any measuring device is subject to error. A steel 
tape's reading of the distance between two objects includes 
the true distance between the objects along with some error 
due to temperature fluctuations, tape sag, and so on. By the 
same token, a driving knowledge test score contains an
individual's "true" driving knowledge score, plus some error.


The reliability of a test can be measured in a number of 
ways, but the most common way is to measure the test's internal 
consistency. This amounts to seeing how well scores on half 
of the items in a test correlate with scores on the other half 
of the test. Table V (next page) shows this split-half
reliability calculated for each form of the test, and for the
experimental, crossvalidation, and experimental plus
crossvalidation administrations combined.
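
The split-half idea can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the study: the odd/even division of items and the function names are assumptions, since the report does not say how the items were divided into halves or whether any correction for test length was applied.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(responses):
    """responses: one list of item scores (e.g. 0/1) per examinee.

    Assumed split: odd-numbered items versus even-numbered items.
    """
    first_half = [sum(row[0::2]) for row in responses]
    second_half = [sum(row[1::2]) for row in responses]
    return pearson(first_half, second_half)


# Four hypothetical examinees answering a four-item test.
example = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0], [1, 0, 1, 0]]
print(round(split_half_reliability(example), 3))
```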




TABLE V


Split-half Reliabilities


                    N      Form A         Form B
Experimental       450     [illegible]    .888
Crossvalidation    225     [illegible]    .866
Combined           675     .848           .888


Relative to typical test reliability values, the values
shown in Table V are somewhat low. Generally, one likes to
see reliability values of 0.90 or higher. However, low
reliabilities tend to be associated with high validity values.
To be highly valid, a knowledge test will generally have to
be one that samples a broad range of knowledge. As the range
of knowledge that the test samples increases, so do
the chances that knowing the answers to some items will be
unrelated to knowing the answers to certain other items. This
in turn leads to lower test reliability. Thus, the validity
values achieved with the present test have come at a certain
cost in reliability.


The primary importance of a test's reliability is in 
deciding how much faith to put into the accuracy of an individual 
score. For example, it can be used to decide whether or not 
the score obtained by one individual is significantly different 
from the score obtained by another individual. Since the test 
was designed for evaluating group performance (whether or not 
students in one driver education course perform better than 
students in another course), the reliability of this test is
not such an important consideration as it might be in other
tests. Therefore, we will not go any further into the details
of the test's reliability, and its application in interpreting
individual scores. These details are given in the test manual.




This completes the description of the main results for
the Driving Knowledge Test. There are still some issues to
be discussed, the performance of the nine-point drivers for
example. However, it will be more convenient to discuss these
as general issues, in common with issues arising from the results
for the Driving Situations Test. Therefore, we will move on
to present the Driving Situations Test results, and return
to the general issues when the results of both tests can be
discussed together.




CONSTRUCTION AND VALIDATION OF 
THE DRIVING SITUATIONS TEST 


The terms of reference for developing the Driving 
Situations Test specified a test based on asking an individual 
for estimates of the probabilities of adverse outcomes of various 
driving situations. The adverse outcomes were to include accidents, 
getting traffic tickets, and so on. 


Developing a test that would meaningfully measure risk- 
taking tendencies on this basis presented some serious diffic- 
ulties. The central difficulty in developing a test focused on 
probabilities is that risky decisions in driving, or anything 
else for that matter, are not just a matter of a person's 
subjective estimates of the probabilities of different possible 
outcomes, but also a matter of the person's expectations about 
the gains and losses associated with different outcomes. For 
example, two people might have exactly the same estimates of 
the probability of an accident in a particular situation, 
but come to different decisions about driving in the situation 
because their subjective perceptions of the overall gains and 
losses are different. Conversely, the same two persons might 
have equivalent perceptions of the gains and losses but behave 
differently because their probability estimates are different. 
In other words, a test of risk taking has to reflect not just 
an individual's tendencies to under or over estimate probabilities, 
but also his or her subjective values and priorities. This was
the approach we took to developing the Driving Situations Test.
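
The point that probability estimates and subjective values jointly drive a decision can be illustrated with a very small expected-value calculation. This is only a sketch under stated assumptions: the expected-value rule, the function names, and all of the numbers are hypothetical, and are not taken from the study or from any particular decision theory cited in it.

```python
def expected_net_gain(p_accident, gain_from_driving, loss_from_accident):
    """Expected value of driving: the gain if no accident, the loss if there is one."""
    return (1 - p_accident) * gain_from_driving - p_accident * loss_from_accident


def would_drive(p_accident, gain_from_driving, loss_from_accident):
    # Hypothetical decision rule: drive only if the expected net gain is positive.
    return expected_net_gain(p_accident, gain_from_driving, loss_from_accident) > 0


# Same probability estimate, different subjective values.
print(would_drive(0.05, gain_from_driving=10, loss_from_accident=100))  # True
print(would_drive(0.05, gain_from_driving=2, loss_from_accident=100))   # False

# Same subjective values, different probability estimates.
print(would_drive(0.05, gain_from_driving=10, loss_from_accident=100))  # True
print(would_drive(0.20, gain_from_driving=10, loss_from_accident=100))  # False
```

Under this simple rule, either a change in the perceived gains and losses or a change in the estimated probability is enough to reverse the decision.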


Taking this approach to the Driving Situations Test also
had the advantage that it made the test a reasonably
straightforward application of human decision theory.(9) There are
in fact a number of theories of human decision making, but
all of them conceptualize decision making in much the same way:
how people make trade-offs between the probabilities associated




with decision outcomes and the payoff values associated
with decision outcomes. For the purposes of the Driving Sit-
uations Test, its construction became a matter of creating a
test that would measure how different people make these
trade-offs in the context of driving. In turn, adopting the most
commonly used method of studying human decision making, this
meant constructing a test that would present a variety of
driving scenarios, each of which would involve an accident risk
on the one hand, and on the other hand, certain gains or losses
to be made from driving. A person's willingness to accept the
accident risk relative to the gain from driving in the scenario
would then reflect his or her risk-taking tendency.


To provide a systematic basis for creating scenarios, 
we began by defining five driving factors. The factors were: 
. the physiological state of the driver
. the behaviour of the driver
. the condition of the vehicle
. the weather or road conditions and
. the reason for driving.


To keep each scenario reasonably simple, it was made up 
using three out of the five possible factors. For example, it 
might describe the state of the driver, the condition of the 
vehicle, and the reason for driving. Other scenarios would 
then contain other combinations of three factors. The Driving
Situations Test, then, consisted of a series of items, each one
containing a scenario for which an examinee would rate the extent
to which the gain from driving would or would not outweigh the
risk of an accident.


The five driving factors were chosen not just to provide 
inspiration for making up scenarios; they also represent basic 
factors typically used in analyzing the cause of an accident. 


As such they served another objective, finding out whether or 




not particular factors would have greater or lesser weight in 
determining a driver's assessment of risk. To achieve this 
objective the test items were constructed so that different 
factors and certain of their combinations would appear equally 
often on the test. 


To make the scenarios concrete, each factor was defined 
in terms of four specific actions or states which were also 
to embody key attitude criteria defined by the Phase 1 experts. 


For each factor these were: 


The Physiological State of the Driver:

. drugs/alcohol
. fatigue
. visual impairment
. emotional state

The Behaviour of the Driver:

. speeding
. tailgating
. disobeying a sign
. no seat belt

The Condition of the Vehicle:

. brakes
. steering
. tires
. general mechanical problem

The Environment:

. rain
. snow
. fog
. ice

The Reason for Driving:

. no compelling reason
. pleasure
. emergency
. obligation to another person.


Given these definitions, a scenario might then consist of driving 
while fatigued (physiological state), in rain (environment),
in an emergency (reason for driving).




A test that included all combinations of the twenty
possible states, taken three at a time, would contain 1140
different scenarios. To systematically reduce the total number
of scenarios, and make it possible to analyze the later results
in terms of the driving factors, a statistical design called
a Balanced Incomplete Blocks design (10,11) was used to make
up a reduced number of combinations of factors and states for
the scenarios. Details of the combinations that emerged using
this design are given in the User's Manual for this test.
For present purposes, we need only say that in the resulting
40-item test, each main factor appears a total of six times, each
pair of factors appears a total of three times, and each triplet
of factors appears once.
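
The counts quoted above can be checked with a short Python sketch. It does not reproduce the actual design in the User's Manual; it only enumerates the ways of taking three of the five factors and tallies how often each factor and each pair of factors appears. The variable names are illustrative. Reading the 40 items as four such sets of ten factor triplets, with the four states of each factor rotated across sets, would be consistent with these counts, but that reading is an assumption, not something the report states.

```python
from collections import Counter
from itertools import combinations
from math import comb

FACTORS = [
    "physiological state",
    "behaviour",
    "vehicle condition",
    "environment",
    "reason for driving",
]
STATES_PER_FACTOR = 4

# All ways of taking three of the twenty states: the full, unreduced test.
all_state_triples = comb(len(FACTORS) * STATES_PER_FACTOR, 3)

# Every way of choosing three of the five factors.
factor_triplets = list(combinations(FACTORS, 3))

# How often each factor, and each pair of factors, occurs across the triplets.
factor_counts = Counter(f for triplet in factor_triplets for f in triplet)
pair_counts = Counter(
    pair for triplet in factor_triplets for pair in combinations(triplet, 2)
)

print(all_state_triples)            # 1140
print(len(factor_triplets))         # 10
print(set(factor_counts.values()))  # {6}
print(set(pair_counts.values()))    # {3}
```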


To respond to each scenario, the examinee was given a
seven point scale representing a trade-off between the risk
of an accident and the gain to be made from driving in the
situation. The scale was to be used so that the number "1"
would represent the judgement that accident risk would greatly
outweigh the possible gain to be made from driving; the number
"7" would represent the judgement that the gain greatly out-
weighed the risk; and the number "4" would represent the
judgement that accident risk and gain were evenly balanced.


We did not ask any experts to do an equivalent of an
Angoff procedure, or any other form of relevance rating of the
content of the test. Firstly, it had been relatively easy to
construct scenarios to reflect the Phase 1 experts' requirements;
so there did not appear to be much to be gained by going to a
new set of experts. Secondly, the test was very carefully
constructed to balance factors and states among items; so there
would be no freedom to add or remove items at the suggestions
of a new set of experts. Finally, having constructed the test,
we saw that it would be very difficult for an expert to do more
than guess as to how drivers should, or would, rate the different




scenarios. Thus, aside from the input of the Phase 1 experts,
the Driving Situations Test was designed to be more a Norms
Referenced test than a Criterion Referenced test.


The Driving Situations Test was administered to exactly
the same drivers who took the Driving Knowledge Test in both 
the experimental and crossvalidation administrations of the 
tests. The description of these drivers, and the procedure 
for administering the test have already been given in connection 
with the Driving Knowledge Test results, so they do not need 
to be repeated here. 


As will be seen shortly, data from the experimental 
administration of the test showed that the provisional test 
had some validity for discriminating among driver groups, and 
that all of the items on the provisional test contributed about 
equally well to the test's validity. Therefore, there was no 
reason to discard items to produce a final version of the test.
Given the highly integrated nature of the test, it was likely 
that either all of the items would turn out to have some validity, 
or that none of them would. Since all of the items turned out 
to have some validity, the original version of the test was 


crossvalidated without modification. 


Validity of the Driving Situations Test 


The Driving Situations Test was scored by calculating the
average of the ratings that an examinee gave across all items on
the test. Thus the minimum score on the test would be 1.0. This would
indicate an individual who consistently judged that the accident
risk greatly outweighed the gain from driving on every item.
A maximum score on the test would be 7.0. This would be the
score of an individual who thought that the gain from driving
consistently outweighed the risk of an accident. In other words,
a low score on the test would represent low risk-taking tendencies,




and a high score would represent high risk-taking tendencies. 
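
The scoring rule just described amounts to a simple mean of the ratings. A minimal Python sketch follows; the function name and the input checks are illustrative assumptions, not part of the test manual.

```python
def situations_score(ratings):
    """Mean of an examinee's ratings, each on the 1-to-7 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(not 1 <= r <= 7 for r in ratings):
        raise ValueError("ratings must lie between 1 and 7")
    return sum(ratings) / len(ratings)


print(situations_score([1] * 40))  # 1.0, lowest possible risk-taking score
print(situations_score([7] * 40))  # 7.0, highest possible risk-taking score
```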


Table VI shows the means and standard deviations of the
scores obtained by the three driver groups. The data in Table
VI are based on combined data for both the experimental and
crossvalidation administrations of the test. Data for the two
administrations were combined because there were no significant
differences between the results for the two administrations.


TABLE VI


Means and Standard Deviations


                 N      Mean    Standard Deviation
Professional    225     1.89    [illegible]
Nine-point      225     2.37    [illegible]
Student         225     2.75    .896


It can be seen in Table VI that professionals showed the
lowest risk-taking tendency and students the highest risk-taking
tendency, with the nine-point drivers falling mid-way between
the professionals and the students. The average score of all
675 drivers combined was 2.34, showing that drivers generally
considered the scenarios on the test to be on the risky side.
This is consistent with the fact that the states used to represent
the different driving factors typically reflected adverse driving
conditions rather than good driving conditions.


There were also differences in the standard deviations 
for the three groups. The professionals' scores were grouped 
closely together; the nine-point drivers' scores were slightly 
more dispersed; and the students’ scores were still more dispersed. 
The differences among the standard deviations probably reflect, 
at least in part, the region of the scale used by each group. 


The professionals, by confining their ratings to the lower end




of the scale, were bound to yield scores with a relatively 
small standard deviation. In contrast, the students, by using 
the middle of the scale had room to produce scores with a 
larger standard deviation. 


Using the standard deviations to calculate how much the
scores of the different groups overlap with one another, we
find that about 75% of the professionals scored lower than
about 75% of the students, and about 60% of the professionals 
scored lower than about 60% of the nine-point drivers. Although 
there is generally more overlap among the three groups on this 
test than there was on the Driving Knowledge Test, the separations 
among the groups are still substantial. 
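
One way to arrive at statements of this form is sketched below in Python. It is offered only as an assumption about how such figures might be computed, not as the authors' method: it treats each group's scores as normally distributed and finds the cut score at which the proportion of the lower group falling below it equals the proportion of the higher group falling above it. The means and the students' standard deviation are taken from Table VI; the professionals' standard deviation of 0.40 is a hypothetical placeholder, because that entry is illegible in this copy.

```python
from statistics import NormalDist


def mutual_separation(mean_low, sd_low, mean_high, sd_high):
    """Proportion p such that p of the lower group scores below a cut point
    and the same proportion p of the higher group scores above it
    (assumes both groups are normally distributed)."""
    z = (mean_high - mean_low) / (sd_low + sd_high)
    return NormalDist().cdf(z)


# Professionals versus students; the value 0.40 is hypothetical (see above).
p = mutual_separation(mean_low=1.89, sd_low=0.40, mean_high=2.75, sd_high=0.896)
print(round(p, 2))
```

With these inputs the result comes out at roughly three quarters, which is in line with the figure quoted in the text, though that agreement depends entirely on the assumed standard deviation.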


Validity coefficients, like the ones calculated for the
Driving Knowledge Test, were calculated to see how well the
test discriminated between the professionals and the other
two groups, and to see whether or not the crossvalidation
validities would substantiate those found for the experimental
samples. Table VII shows validities calculated for discriminating
between professionals and students, excluding the
nine-point drivers. These validity values are relevant to the
problem of evaluating driver education programmes, the problem
that this test and the knowledge test were designed to address.
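
For readers who want a concrete picture of such a coefficient, the Python sketch below computes a point-biserial correlation between test scores and membership in one of two groups. Treating the validity coefficient as a point-biserial correlation is an assumption made here for illustration; the exact statistic used is defined in the test manuals, and the scores in the example are invented.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(scores_group1, scores_group0):
    """Correlation between score and a 0/1 group code (group1 coded as 1)."""
    all_scores = list(scores_group1) + list(scores_group0)
    p = len(scores_group1) / len(all_scores)  # proportion in group 1
    spread = pstdev(all_scores)               # SD of all scores together
    return (mean(scores_group1) - mean(scores_group0)) / spread * sqrt(p * (1 - p))


# Invented scores: students (coded 1) rate situations as less risky than
# professionals (coded 0), so the coefficient is positive with this coding.
students = [2.9, 2.4, 3.1, 2.6]
professionals = [1.8, 2.0, 1.7, 2.1]
print(round(point_biserial(students, professionals), 3))
```

The sign of the coefficient depends only on which group is coded 1; its size is what indicates how well the scores separate the two groups.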


TABLE VII


Professional - Student Validities


                    N      Validity
Experimental       300     [illegible]
Crossvalidation    150     .407
Combined           450     .485




Table VIII shows the validities for discriminating the
professionals from the nine-point drivers.


TABLE VIII


Professional - Nine-point Validities


                    N      Validity
Experimental       300     .292
Crossvalidation    150     [illegible]
Combined           450     [illegible]


Finally, Table IX shows validity coefficients for
discriminating the professionals from the students and nine-point
drivers as a combined group.


TABLE IX


Professional - Nine-point plus Student Validities


                    N      Validity
Experimental       450     [illegible]
Crossvalidation    225     .543
Combined           675     [illegible]


The validities shown in Tables VII through IX are all
statistically significant at the 0.001 level or better. In
addition, there were no significant differences between the
validities obtained from the experimental sample and the
validities obtained from the crossvalidation sample. For this
reason, a validity for the experimental and crossvalidation
administrations combined is shown in each Table. Since the
version of the test administered to the crossvalidation sample
was exactly the same as the one administered to the experimental
sample, there was no reason to expect any loss in validity due




to discarding items from the experimental test. In fact, in
the case of the professionals and the nine-point drivers, the
validity went up slightly, though not significantly, on
crossvalidation.


The validity of the Driving Situations Test for
discriminating professionals and students is a value that would be
considered moderately good for a test of this sort. The test
has only fair validity for discriminating between professionals
and nine-point drivers.


Reliability of the Driving Situations Test 


Table X shows the split-half reliabilities (like the ones
calculated for the knowledge test) for the experimental,
crossvalidation, and combined experimental and crossvalidation
administrations.


TABLE X


Split-half Reliabilities


                    N      Reliability
Experimental       450     .960
Crossvalidation    450     .967
Combined           450     .962


Since there were no differences among the reliabilities 
calculated for the three driver groups separately, we have 
shown values for the three groups combined. The reliabilities 
shown in Table X are also essentially equal for both test 


administrations. 


A reliability of 0.96 represents a very high reliability. 


In addition, the reliabilities of individual items on the test 




ranged from 0.50 to 0.70, which also represents quite high values
for individual item reliabilities.* All of the data on the
test's reliability point to the conclusion that the test is a
highly reliable one, and that it measures a single attitudinal
dimension.


Effects of Driving Factors 


We analyzed the data to see whether or not the presence
of any particular factor in a scenario, vehicle condition for
example, would cause drivers to give higher or lower risk
ratings. There were no significant differences among the
mean ratings given for different driving factors. In other
words, not only were all the driving factors given equal weight
as suggested by the reliability data, they also made equal
contributions to the total score that a driver got on the
test. This merely means that the states used to represent each
factor (rain, fog, and the like) were such that the total
effect that a particular factor had on the test score was fairly
constant. If we had wanted to, we could likely have manipulated
the contributions made by different factors by simply defining
less risky or more risky states to represent each factor in
the scenarios.


This completes the description of the results for the 
Driving Situations Test. The results show that the test is 
a moderately valid and highly reliable test for discriminating 
among drivers on the basis of their relative sensitivities 
to accident risks. We will now turn to results that describe 
the relationship between the scores on the Driving Situations 


Test and the scores on the Driving Knowledge Test. 


* Item reliability is the correlation between each item and the 
total score on a test. (See page 12.) 




RELATIONSHIP BETWEEN THE DRIVING KNOWLEDGE 
TEST AND THE DRIVING SITUATIONS TEST 


Since both tests were given to exactly the same drivers, 
we can show the correlation between scores on one test and 
scores on the other test. In addition, we can show whether 
or not using both tests together gives greater discrimination
among the driver groups than using either test alone.


The overall correlation between scores on the knowledge 
test and scores on the driving situations test was -0.380 in 
the case of Form A of the knowledge test and -0.377 in the case 
of Form B of the knowledge test. The minus sign in the 
correlations means that as scores decreased on the Driving Situations 
Test, scores increased on the Driving Knowledge Test. That is, 
the more risk a driver assigned to the different driving 
situations, the higher his or her driving knowledge score; and the 
less risk the driver assigned to the different driving situations, 
the lower his or her driving knowledge score. 


The correlations between driving situations scores and 
driving knowledge scores were essentially the same for all 
three driver groups. The value of the correlation between the 
two tests is statistically significant, and it suggests that 
there is a moderate relationship between the tendency to perceive 
risk in various driving situations and knowledge of safe driving. 
The most sensible interpretation of this result would be that 
knowledge of safe driving serves to increase risk consciousness. 


The partial correlation between scores on the Driving 
Situations Test and driver status, with knowledge held constant, 
gives a measure of the test's ability to measure risk-taking 
independent of driving knowledge. The partial correlation 
found for discriminating the professionals from the students 
(combined experimental and crossvalidation data) was [value illegible]. 
This value shows that, while driving knowledge makes a sub- 
stantial contribution to scores on the Driving Situations Test, 
the Driving Situations Test scores nevertheless represent what 
could be called a valid and independent measure of driver 
risk-taking attitudes. 
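
A partial correlation of this kind can be computed from the three pairwise correlations. The sketch below uses the standard first-order partial correlation formula. The function name is ours, and the input values are assumptions chosen only to illustrate the arithmetic (the between-test value is simply set near the correlation reported above); they are not the study's data.

```python
import math


def partial_correlation(r_xy: float, r_xz: float, r_yz: float) -> float:
    """First-order partial correlation of x and y with z held constant."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("undefined when x or y is perfectly correlated with z")
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical inputs, for illustration only:
#   x = Driving Situations score
#   y = driver status (student = 0, professional = 1)
#   z = Driving Knowledge score
r_situations_status = -0.40     # assumed
r_situations_knowledge = -0.38  # set near the between-test correlation in the text
r_status_knowledge = 0.70       # assumed

r_partial = partial_correlation(
    r_situations_status, r_situations_knowledge, r_status_knowledge
)
print(round(r_partial, 3))
```

With inputs like these, the partial correlation is smaller in size than the raw correlation with driver status, which is the pattern the text describes: knowledge accounts for part, but not all, of the relationship.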


Since the Driving Knowledge Test and the Driving Situations 
Test measure independent attributes, to a certain extent at least, 
scores on both tests should provide better discrimination among 
the driver groups than scores on either test alone. In the 
present case, however, combining scores on both tests adds very 
little to the discrimination that can be obtained by using the 
knowledge scores alone. 


The discrimination that can be obtained by using two 
tests instead of one test is a maximum when both tests have the 
same validity values, and when the correlation between the two 
tests is zero. In the present case, the knowledge test has a 
much higher validity than the risk perception test, and in 
addition, there is a moderate correlation between scores on the two 
tests. As a result, using the scores on the Driving Situations 
Test in addition to scores on the Driving Knowledge Test adds 
only marginally to the discrimination that can be obtained by 
using the Driving Knowledge Test alone. 
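
One way to see this argument numerically is the standard formula for the multiple correlation of a criterion with two predictors. The sketch below is an illustration under assumed values: the validities 0.70, 0.40 and 0.55 are hypothetical, both tests are taken to be scored in the same direction, and only the between-test correlation of 0.38 is modelled on the figure reported earlier.

```python
import math


def multiple_correlation(r1: float, r2: float, r12: float) -> float:
    """Multiple correlation R of a criterion with two predictors.

    r1, r2: correlation of each test with the criterion (its validity).
    r12:    correlation between the two tests.
    """
    if abs(r12) >= 1.0:
        raise ValueError("r12 must lie strictly between -1 and 1")
    r_squared = (r1 ** 2 + r2 ** 2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 ** 2)
    return math.sqrt(r_squared)


# Hypothetical validities (illustration only, not the study's values).
unequal_correlated = multiple_correlation(0.70, 0.40, 0.38)
equal_uncorrelated = multiple_correlation(0.55, 0.55, 0.0)

print(f"unequal validities, r12 = 0.38: R = {unequal_correlated:.3f} (best single test 0.70)")
print(f"equal validities,   r12 = 0.00: R = {equal_uncorrelated:.3f} (best single test 0.55)")
```

Under these assumptions the second test adds little when one test is much more valid and the two are correlated, and adds a great deal when the validities are equal and the tests are uncorrelated.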


We hasten to add, however, that this does not invalidate using 
the Driving Situations Test to measure risk-taking tendencies. 
What we have said about using both tests together is relevant 
only if the sole purpose of using the tests is to discriminate 
between professionals and students. 




TWO ADMINISTRATIONS OF THE TESTS 


A sample of students took the tests both at the beginning 
and at the end of their driver education course, although only 
end-of-course scores were used for test validation. The purpose 
of the before and after administrations was to explore each 
test's sensitivity to changes in knowledge or risk perception. 
Whether or not the tests would detect such changes 
was not part of the validation because there was no a priori 
reason to believe that they should. 


In the case of the Driving Knowledge Test, half of the 
students took Form A of the test and the other half took Form B 
of this test at the beginning of their course. At the end of 
the course, all of the students took both forms of the test. 
Overall, the post-course scores were about five percent higher 
than the pre-course scores. This difference was statistically 
significant. As information relevant to considering whether 
or not this difference is a substantial one, we will mention 
that the students at the beginning of their course had either 
just taken or were about to take a knowledge test as part of 
the requirements for obtaining a learner's licence. 


The Driving Knowledge Test was administered before and 
after the course only for the experimental administration of 
the test. Students in the crossvalidation administration took 
the test only at the end of the course. Since the post-course 
scores of the crossvalidation administration were the same as 
the post-course scores of the experimental administration, 
it seems reasonable to suggest that the increase in the exper- 
imental group's post-course scores had something to do with 
the course itself rather than just prior experience with the 
test. 




The students wrote the Driving Situations Test before 
and after their course on both the experimental and 
crossvalidation administrations of the test. In the case of the 
experimental administration, there was a small, statistically 
significant improvement in scores going from the pre-course 
to the post-course scores. The improvement was not confirmed 
by the crossvalidation results; scores at the beginning and 
end of the course were almost exactly the same. In considering 
these results, however, it should be kept in mind that the current 
course was not specifically designed to reduce risk-taking 
tendencies in the sense that they are defined and measured by 
the Driving Situations Test. 




DISCUSSION 


Both the Driving Knowledge and the Driving Situations 
Tests should serve adequately for evaluating a new driver 
education course. In the case of the Driving Knowledge Test, 
it covers knowledge areas that experts believe to be important 
for evaluating driver education and relevant to safe driving. 
The passing score derived from the experts' Angoff ratings 
would serve as a target for graduates of the new course to 
aim for. The target is also consistent with the norms for the 
test in that 98% of the professionals met the experts' passing 
criterion, while only a few of the drivers in the other groups 
met it. 


The norms for the Driving Knowledge Test show 
almost no overlap between the scores of the students and 
professionals. Thus, if graduates of a new driver education course 
showed substantial improvements in knowledge, there is more 
than adequate separation between the present scores of students 
and the scores of professionals for these improvements to be 
apparent. 


The scores of students and professionals are also closely 
grouped about their respective group averages. This means that 
the test would be quite sensitive to even small improvements 
in scores of graduates from a new course. For example, if 100 
graduates of a new course scored just one score point higher 
than the present students, this improvement would be statistically 
significant. Whether or not an improvement of one point would 
be considered a material improvement is another matter. In 
any case, there is little danger that the test would fail to 
detect material improvements for lack of statistical significance. 
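
The one-point example can be sketched as a simple one-sample z test against the present student norm. This is an illustration, not the report's own calculation: the standard deviation of 4 score points is an assumed value, and treating the present student mean as a fixed norm is a simplification.

```python
import math


def one_sample_z(mean_gain: float, sd: float, n: int) -> tuple[float, float]:
    """z statistic and two-sided p-value for a mean gain over a fixed norm."""
    if sd <= 0 or n <= 0:
        raise ValueError("sd and n must be positive")
    z = mean_gain / (sd / math.sqrt(n))
    # Two-sided tail area of the standard normal distribution.
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


# 100 graduates scoring one point above the present student norm.
# sd=4.0 is an assumed value, not a figure from the report.
z, p = one_sample_z(mean_gain=1.0, sd=4.0, n=100)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

The tighter the scores cluster about the group average (the smaller the standard deviation), the larger z becomes for the same one-point gain, which is why closely grouped scores make the test sensitive to small improvements.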


Compared with the previous Transport Canada test, the 
new knowledge test has higher validity and slightly higher 
reliability. As a matter of interest, average percentage scores 
of professionals were almost exactly the same on both tests, 
about 80%. Students, on the other hand, scored almost 15% lower 
on the new test than students scored on the Transport Canada 
test. This difference should not be interpreted as a 
substantive difference in knowledge between the two groups of 
students. Without norms for the same students taking both tests, 
there is no way of telling whether this difference is due to 
differences in test construction or due to differences in 
knowledge. 


The Driving Situations Test, while being quite reliable, 
does not have the degree of validity that its knowledge 
counterpart does. On the other hand, its validity is comparable with 
validities typically found for attitude tests of this sort. 
In addition, things that can be said about the knowledge test's 
ability to detect and measure improvements in scores for 
graduates of a new course can also be said in principle for the 
Driving Situations Test. 


While the Driving Situations Test can adequately serve 
evaluation purposes, its real value may lie in showing that 
risky decision making in driving can be treated using 
well-established techniques for studying human decision making in 
other areas. For example, the test represents an application 
of Conjoint Measurement Theory, which is widely used in marketing 
to study consumer decision making. In a broader context, the 
Driving Situations Test might bridge the gap between the 
considerable body of knowledge that has been developed about 
risky decision making in other areas, and our relatively 
unsophisticated knowledge about risky decision making in driving. 


Gaining a better understanding of why there is such a 
weak, sometimes non-existent, relationship between accidents and 
driving abilities may be an area where the approach taken in 
the Driving Situations Test would be useful. It is becoming 
more and more evident that producing knowledgeable and skilled 
drivers may be a necessary condition for reducing accidents, but 
it is by no means a sufficient condition. This is the importance 
of the finding that the Driving Situations Test measures risk-taking 
independent of driving knowledge. Indeed, it is a finding that 
confirms any driver's own experience: year to year, day to day, 
and even minute to minute fluctuations in one's value systems 
have a significant impact on decisions about whether or not 
to accept a particular driving risk. Applying the established 
methods represented in a test like the Driving Situations Test 
might help to better understand how these fluctuations work, 
and how drivers could be helped in controlling them. 


As mentioned in describing the relationship between 
scores on the Driving Situations Test and scores on the Driving 
Knowledge Test, the Driving Situations Test seemed to measure 
a quality that was independent of knowledge, but the 
independence was not strong. To find out whether or not this 
weakness of independence is general would require more study. 
Both tests covered much the same content, so it is not surprising 
that responses to both tests were correlated with one another. 
In addition, the Reason-for-Driving factor was kept relatively 
constant so that risk ratings would be more a reflection of the 
driving hazards in the scenarios than of drivers' value systems. 
If we constructed a test that focused more directly on drivers' 
perceived payoffs, we might get results that are less dependent 
on driving knowledge and at the same time more illuminating 
about drivers' risk-taking. 


To remark on the results of administering the tests 
at both the beginning and end of the driver education course: 
the results for the knowledge test suggested that the present 
course produces at least a measurable increase in driving 
knowledge as defined by the experts. How this increase should be 
evaluated, we leave to others to decide. 


There was a small improvement in the driving situations 
scores during the course, but the improvement was not confirmed 
by the crossvalidation results. The present driver education 
course was not designed with goals that would lead to expecting 
these scores to improve, so it should not be surprising that 
they did not. Accepting risk is a trade-off between 
probabilities and payoffs. If students are as influenced by payoffs 
related to social expectations and needs to develop self-esteem 
as all the evidence about teenage development suggests, it may 
be difficult to design a driver education course that attempts 
to remove these fundamental influences from the driving scene. 
We do not intend these remarks to suggest doubt about what a 
new course might achieve. We do suggest that any measurable 
achievement should be treated as achievement indeed. 


As a final item, we will turn to the performance of 
the nine-point drivers on each of the tests. When the 
nine-point drivers were suggested as an experimental group it was 
thought that they might provide a norm representing inadequate 
drivers, and that their performance on either test might come 
out worse than that of the students. Instead, they performed 
substantially better than the students. Indeed, if students 
from a new driver education course performed as well as the 
nine-point drivers performed, their performance would represent 
a substantial improvement over students from the present course. 


From reviewing demographic and driving experience data 
collected along with the test administrations, it appeared that 
the nine-point drivers might not be so much inadequate as they 
are overexposed. The nine-point drivers reported above average 
annual distances driven, and they included more commercial vehicle 
drivers than would be found in a sample from the general 
population. Thus, some proportion of their accumulated demerit 
points can be attributed to greater exposure. However, after 
correcting for distances driven, the nine-point drivers still 
had an accident rate twice that of the professionals. On the 
other hand, the accident rate of the nine-point drivers was still 
lower than the accident rates that would be predicted for the 
students in two or three years following graduation from driver 
education. Indeed, one of the students in this study had already 
been involved in a fatal accident before graduation. While we 
have no way of knowing how the test scores of the nine-point 
drivers would compare with drivers from the general public, 
the demographic and driving experience data suggest that the 
nine-point drivers might not have been too unrepresentative of 
the general public. 
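
Correcting for distance driven amounts to comparing accidents per unit of exposure rather than raw accident counts. The sketch below shows the idea with entirely hypothetical group figures, chosen so that the ratio comes out at two to mirror the comparison in the text. The function name and every number are assumptions, not data from this study.

```python
def accidents_per_million_km(
    accidents: int, drivers: int, annual_km_per_driver: float, years: float
) -> float:
    """Accident rate per million kilometres of total group exposure."""
    total_km = drivers * annual_km_per_driver * years
    if total_km <= 0:
        raise ValueError("total exposure must be positive")
    return accidents / (total_km / 1_000_000)


# Hypothetical figures, for illustration only.
nine_point_rate = accidents_per_million_km(
    accidents=40, drivers=100, annual_km_per_driver=30_000, years=3
)
professional_rate = accidents_per_million_km(
    accidents=50, drivers=100, annual_km_per_driver=75_000, years=3
)

ratio = nine_point_rate / professional_rate
print(f"nine-point: {nine_point_rate:.2f}, professional: {professional_rate:.2f}, ratio: {ratio:.1f}")
```

In this made-up example the professionals have more accidents in total but drive far more, so their rate per million kilometres is half that of the nine-point drivers.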




REFERENCES 


1) Engel, G.R. & Townsend, M. Examiner's Manual: Driving 
Knowledge Test. September, 1982. 


2) Engel, G.R. & Townsend, M. Examiner's Manual: Driving 
Situations Test. September, 1982. 


3) Bragg, B.W.E. Revision and Evaluation of Driver Education 
in Ontario: Phase 1: Development of an Evaluation Plan. 
Ministry of Transportation and Communications: Toronto, 
Ontario: September, 1980. 


4) RODELL SON MATS ape KING AnWd. O.n Drate, De. & Murdoch,P.A. 
Revision and Evaluation of Driver Education in Ontario: 
Phase 2: Preparation of a Curriculum Development Plan. 
Ministry of Transportation and Communications: Toronto, 
Ontario: September, 1980. 


5) Clifford, L.V. & Deslauriers, B.C. Revision and Evaluation 
of Driver Education in Ontario: Summary of the First Two 
Phases of the Study. Ontario Ministry of Transportation 
and Communications: Toronto, Ontario: October, 1980. 


6) Engel, G.R., Paskaruk, S. & Green, N. Driver Education 
Evaluation Tests. Department of Transport. March, 1978. 


7) Portlock; W.l.) and McDole, T.C., Handbook for Driving 
Knowledge Testing. Highway Research Institute, The 
University of Michigan, Ann Arbor, Michigan: August, 1974 
Report No. HSRI - 001590-3. 


8) Angoff, W.H. Scales, norms, and equivalent scores. In 
R.L. Thorndike (Ed.) Educational Measurement. Washington, 
D.C.: American Council on Education, 1971, 514-515. 


9) Coombs, C.H., Dawes, R.M. and Tversky, A. Mathematical 
Psychology: An Elementary Introduction. Prentice-Hall Inc., 
New Jersey: 1970. 


10) Winer, B.J. Statistical Principles in Experimental Design. 
Second Edition, McGraw Hill, New York: 1973. 


11) Green, Paul. On the Design of Choice Experiments Involving 
Alternatives. The Journal of Consumer Research, 1, Sept. 
1974, pp. 61-68. 

